From CAD to POMDP: Probabilistic Planning for Robotic Disassembly of End-of-Life Products
Baumgärtner, Jan, Hansjosten, Malte, Hald, David, Hauptmannl, Adrian, Puchta, Alexander, Fleischer, Jürgen
Abstract-- To support the circular economy, robotic systems must not only assemble new products but also disassemble end-of-life (EOL) ones for reuse, recycling, or safe disposal. Existing approaches to disassembly sequence planning often assume deterministic and fully observable product models, yet real EOL products frequently deviate from their initial designs due to wear, corrosion, or undocumented repairs. We argue that disassembly should therefore be formulated as a Partially Observable Markov Decision Process (POMDP), which naturally captures uncertainty about the product's internal state. We present a mathematical formulation of disassembly as a POMDP, in which hidden variables represent uncertain structural or physical properties. Building on this formulation, we propose a task and motion planning framework that automatically derives specific POMDP models from CAD data, robot capabilities, and inspection results. To obtain tractable policies, we approximate this formulation with a reinforcement-learning approach that operates on stochastic action outcomes informed by inspection priors, while a Bayesian filter continuously maintains beliefs over latent EOL conditions during execution. Using three products on two robotic systems, we demonstrate that this probabilistic planning framework outperforms deterministic baselines in terms of average disassembly time and variance, generalizes across different robot setups, and successfully adapts to deviations from the CAD model, such as missing or stuck parts.
I. INTRODUCTION
Modern industrial production still follows a linear model of make-use-dispose, accelerating the depletion of natural resources on our planet.
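The Bayesian filter over latent EOL conditions mentioned in the abstract can be sketched as a discrete Bayes update. The states ("loose"/"stuck"), observations, and probabilities below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a belief filter over a hidden part condition,
# updated after each noisy inspection reading.

def update_belief(belief, observation, likelihood):
    """Discrete Bayes update: posterior proportional to likelihood * prior."""
    posterior = {s: likelihood[s][observation] * p for s, p in belief.items()}
    z = sum(posterior.values())
    return {s: p / z for s, p in posterior.items()}

# Prior from CAD/inspection data: assume the bolt is probably loose.
belief = {"loose": 0.8, "stuck": 0.2}

# Sensor model: probability of each torque reading in each hidden state.
likelihood = {
    "loose": {"high_torque": 0.1, "low_torque": 0.9},
    "stuck": {"high_torque": 0.7, "low_torque": 0.3},
}

# A high-torque reading shifts belief toward the bolt being stuck.
belief = update_belief(belief, "high_torque", likelihood)
```

During execution, such a posterior would feed back into the policy, e.g. triggering a different removal strategy once the "stuck" probability dominates.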
A Probabilistic Forecast-Driven Strategy for a Risk-Aware Participation in the Capacity Firming Market
Dumas, Jonathan, Cointe, Colin, Wehenkel, Antoine, Sutera, Antonio, Fettweis, Xavier, Cornélusse, Bertrand
This paper addresses the energy management of a grid-connected renewable generation plant coupled with a battery energy storage device in the capacity firming market, designed to promote renewable power generation facilities in small non-interconnected grids. A recently developed deep learning model known as normalizing flows is used to generate quantile forecasts of renewable generation. Normalizing flows provide a general mechanism for defining expressive probability distributions, requiring only the specification of a base distribution and a series of bijective transformations. Then, a probabilistic forecast-driven strategy is designed, modeled as a min-max-min robust optimization problem with recourse, and solved using a Benders decomposition. Convergence is improved by building an initial set of cuts derived from domain knowledge. Robust optimization models the generation randomness using an uncertainty set that includes the worst-case generation scenario, and protects against this scenario at the minimal increase in cost. This approach improves the results over a deterministic approach with nominal point forecasts by finding a trade-off between conservative and risk-seeking policies. Finally, a dynamic risk-averse parameter selection strategy based on the quantile forecast distribution provides an additional gain. The case study uses the photovoltaic generation monitored on-site at the University of Liège (ULiège), Belgium.
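The flow idea described above — a base distribution pushed through a bijection, with quantiles read off the transformed samples — can be illustrated with a toy stand-in. The affine-plus-softplus transform below is an illustrative substitute for the learned flow in the paper, not the paper's model.

```python
import math
import random

def softplus(x):
    # Monotone bijection from the reals to the positive reals.
    return math.log1p(math.exp(x))

def forecast_quantiles(n_samples=10000, quantiles=(0.1, 0.5, 0.9), seed=0):
    """Empirical quantiles of a base Gaussian pushed through a bijection."""
    rng = random.Random(seed)
    # Base z ~ N(0,1) -> nonnegative "generation" value.
    samples = sorted(softplus(1.5 + 0.5 * rng.gauss(0.0, 1.0))
                     for _ in range(n_samples))
    return [samples[int(q * n_samples)] for q in quantiles]

q10, q50, q90 = forecast_quantiles()
```

Because the transform is monotone, quantiles of the base distribution map directly to quantiles of the forecast, which is what makes flow-based quantile forecasting convenient.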
Solving Goal Hybrid Markov Decision Processes Using Numeric Classical Planners
Teichteil-Königsbuch, Florent (ONERA)
We present the domain-independent HRFF algorithm, which solves goal-oriented HMDPs by incrementally aggregating plans generated by the Metric-FF planner into a policy defined over discrete and continuous state variables. HRFF takes into account non-monotonic state variables and complex combinations of many discrete and continuous probability distributions. We introduce new data structures and algorithmic paradigms to deal with continuous state spaces: hybrid hierarchical hash tables, domain determinization based on dynamic domain sampling or on static computation of probability distributions' modes, and optimization settings under Metric-FF based on plan probability and length. We compare with HAO* on the Rover domain and show that HRFF outperforms HAO* by many orders of magnitude in terms of computation time and memory usage. We also experiment with challenging and combinatorial HMDP versions of benchmarks from numeric classical planning, with continuous dead-ends and non-monotonic continuous state variables.
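One of the determinization schemes named above — static computation of distribution modes — amounts to replacing each probabilistic effect by its single most likely outcome, yielding a deterministic domain a classical planner can solve. The action/outcome encoding below is an illustrative assumption.

```python
# Sketch of determinization by distribution mode: keep only the
# highest-probability outcome of each probabilistic action.

def determinize_by_mode(prob_actions):
    """Map each action name to the effect of its most likely outcome."""
    det = {}
    for name, outcomes in prob_actions.items():
        det[name] = max(outcomes, key=lambda o: o["prob"])["effect"]
    return det

prob_actions = {
    "drive": [
        {"prob": 0.8, "effect": "at_target"},
        {"prob": 0.2, "effect": "stuck"},
    ],
}
det = determinize_by_mode(prob_actions)  # {"drive": "at_target"}
```

The sampled-determinization alternative would instead draw an outcome per action from its distribution, so repeated determinizations cover low-probability effects as well.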
Anticipatory On-Line Planning
Burns, Ethan (University of New Hampshire) | Benton, J. (Arizona State University) | Ruml, Wheeler (University of New Hampshire) | Yoon, Sungwook (Palo Alto Research Center) | Do, Minh B. (NASA Ames Research Center)
We consider the problem of on-line continual planning, in which additional goals may arrive while plans for previous goals are still executing and plan quality depends on how quickly goals are achieved. This is a challenging problem even in domains with deterministic actions. One common and straightforward approach is reactive planning, in which plans are synthesized when a new goal arrives. In this paper, we adapt the technique of hindsight optimization from on-line scheduling and probabilistic planning to create an anticipatory on-line planning algorithm. Using an estimate of the goal arrival distribution, we sample possible futures and use a deterministic planner to estimate the value of taking possible actions at each time step. Results in two benchmark domains based on unmanned aerial vehicle planning and manufacturing suggest that an anticipatory approach yields a superior planner that is sensitive not only to which action should be executed, but when.
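The hindsight-optimization loop described above — sample futures, solve each deterministically, average — can be sketched as a skeleton. `sample_future` and `solve_deterministic` are hypothetical stand-ins for a goal-arrival sampler and a call to an actual deterministic planner; the toy usage at the bottom is purely illustrative.

```python
import random

def hindsight_value(action, sample_future, solve_deterministic,
                    n_samples=50, seed=0):
    """Average value of `action` over sampled concrete futures."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        future = sample_future(rng)                    # one goal-arrival scenario
        total += solve_deterministic(action, future)   # plan value in that scenario
    return total / n_samples

def choose_action(actions, sample_future, solve_deterministic):
    # Same seed for every action, so all actions see the same futures.
    return max(actions, key=lambda a: hindsight_value(
        a, sample_future, solve_deterministic))

# Toy usage: "deliver" is worth the sampled demand, "idle" a flat 0.3.
def sample_future(rng):
    return rng.random()

def solve_deterministic(action, future):
    return future if action == "deliver" else 0.3

best = choose_action(["deliver", "idle"], sample_future, solve_deterministic)
```

Evaluating all actions against the same sampled futures (common random numbers) reduces the variance of the comparison, which matters when planner calls are expensive.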
Fast Incremental Policy Compilation from Plans in Hybrid Probabilistic Domains
Teichteil-Königsbuch, Florent (ONERA)
We present the domain-independent HRFF algorithm, which solves goal-oriented HMDPs by incrementally aggregating plans generated by the METRIC-FF planner into a policy defined over discrete and continuous state variables. HRFF takes into account non-monotonic state variables and complex combinations of many discrete and continuous probability distributions. We introduce new data structures and algorithmic paradigms to deal with continuous state spaces: hybrid hierarchical hash tables, domain determinization based on dynamic domain sampling or on static computation of probability distributions' modes, and optimization settings under METRIC-FF based on plan probability and length. We deeply analyze the behavior of HRFF on a probabilistically-interesting structured navigation problem with continuous dead-ends and non-monotonic continuous state variables. We compare with HAO* on the Rover domain and show that HRFF outperforms HAO* by many orders of magnitude in terms of computation time and memory usage. We also experiment with challenging and combinatorial HMDP versions of benchmarks from numeric classical planning.
SixthSense: Fast and Reliable Recognition of Dead Ends in MDPs
Kolobov, Andrey (University of Washington, Seattle) | Mausam (University of Washington, Seattle) | Weld, Daniel (University of Washington, Seattle)
The results of the latest International Probabilistic Planning Competition (IPPC-2008) indicate that the presence of dead ends, states with no trajectory to the goal, makes MDPs hard for modern probabilistic planners. Implicit dead ends, states with executable actions but no path to the goal, are particularly challenging; existing MDP solvers spend much time and memory identifying these states. As a first attempt to address this issue, we propose a machine learning algorithm called SIXTHSENSE. SIXTHSENSE helps existing MDP solvers by finding nogoods, conjunctions of literals whose truth in a state implies that the state is a dead end. Importantly, our learned nogoods are sound, and hence the states they identify are true dead ends. SIXTHSENSE is very fast, needs little training data, and takes only a small fraction of total planning time. While IPPC problems may have millions of dead ends, these can typically be represented with only a dozen or two nogoods. Thus, nogood learning efficiently produces a quick and reliable means for dead-end recognition. Our experiments show that the nogoods found by SIXTHSENSE routinely reduce planning space and time on IPPC domains, enabling some planners to solve problems they could not previously handle.
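Applying learned nogoods at planning time reduces to a subset test: a state that contains all literals of any nogood is provably a dead end and can be pruned. The literal encoding below (states and nogoods as sets of true literals) is an illustrative assumption.

```python
# Sketch of nogood-based dead-end recognition: a state matching any
# learned conjunction of literals is a provable dead end.

def is_dead_end(state, nogoods):
    """state: set of true literals; nogoods: list of literal sets."""
    return any(ng <= state for ng in nogoods)

nogoods = [{"tire_flat", "no_spare"}]          # one learned conjunction
state = {"tire_flat", "no_spare", "at_depot"}  # matches -> dead end
pruned = is_dead_end(state, nogoods)
```

Because the learned nogoods are sound, this check never prunes a state that can still reach the goal, so it is safe to run inside any MDP solver's expansion loop.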
Improving Determinization in Hindsight for On-line Probabilistic Planning
Yoon, Sungwook (Palo Alto Research Center) | Ruml, Wheeler (University of New Hampshire) | Benton, J. (Arizona State University) | Do, Minh (Palo Alto Research Center)
Recently, "determinization in hindsight" has enjoyed surprising success in on-line probabilistic planning. This technique evaluates the actions available in the current state by using non-probabilistic planning in deterministic approximations of the original domain. Although the approach has proven itself effective in many challenging domains, it is computationally very expensive. In this paper, we present three significant improvements to help mitigate this expense. First, we use a method for detecting potentially useful actions, allowing us to avoid estimating the values of unnecessary ones. Second, we exploit determinism in the domain by reusing relevant plans rather than computing new ones. Third, we improve action evaluation by increasing the chance that at least one deterministic plan reaches a goal. Taken together, these improvements allow determinization in hindsight to scale significantly better on large or mostly-deterministic problems.
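The plan-reuse improvement amounts to caching: when the same state recurs across sampled futures, a previously computed deterministic plan is returned instead of invoking the planner again. This is a minimal memoization-style sketch; `planner` is a hypothetical stand-in for a deterministic planner call, and the state/plan encoding is an illustrative assumption.

```python
# Sketch of plan reuse across hindsight samples: replan only on cache miss.

class PlanCache:
    def __init__(self, planner):
        self.planner = planner   # callable: state -> plan (tuple of actions)
        self.cache = {}
        self.planner_calls = 0

    def get_plan(self, state):
        if state not in self.cache:
            self.planner_calls += 1
            self.cache[state] = self.planner(state)
        return self.cache[state]

# Toy usage: the same state queried in two sampled futures replans once.
cache = PlanCache(lambda state: ("move", "pickup"))
plan1 = cache.get_plan("s0")
plan2 = cache.get_plan("s0")
```

Since planner calls dominate the cost of determinization in hindsight, even a simple cache like this directly attacks the expense the abstract describes.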
Classical Planning in MDP Heuristics: with a Little Help from Generalization
Kolobov, Andrey (University of Washington, Seattle) | Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle)
Computing a good policy in stochastic uncertain environments with unknown dynamics and reward model parameters is a challenging task. In a number of domains, ranging from space robotics to epilepsy management, it may be possible to have an initial training period during which suboptimal performance is permitted. For such problems it is important to be able to identify when this training period is complete, so that the computed policy can be used with high confidence in its future performance. A simple principled criterion for identifying when training is complete is that the error bounds on the value estimates of the current policy are sufficiently small that the optimal policy is fixed, with high probability. We present an upper bound on the amount of training data required to identify the optimal policy as a function of the unknown separation gap between the optimal and the next-best policy values. We illustrate with several small problems that by estimating this gap in an online manner, the number of training samples needed to provably reach optimality can be significantly lower than predicted offline using a Probably Approximately Correct framework that requires an input epsilon parameter.
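The stopping criterion described above can be sketched as a gap-versus-confidence-radius test: training can stop once the estimated value gap between the best and runner-up policies exceeds the sum of their confidence radii, so the ranking is fixed with high probability. The policy names and numbers below are illustrative assumptions.

```python
# Sketch of a gap-based stopping rule for the training period.

def training_complete(value_estimates, radii):
    """value_estimates: {policy: estimated value}; radii: {policy: CI radius}."""
    ranked = sorted(value_estimates, key=value_estimates.get, reverse=True)
    best, second = ranked[0], ranked[1]
    gap = value_estimates[best] - value_estimates[second]
    # The ranking is fixed w.h.p. once the intervals cannot overlap.
    return gap > radii[best] + radii[second]

estimates = {"pi_a": 10.0, "pi_b": 8.0}
done = training_complete(estimates, {"pi_a": 0.5, "pi_b": 0.5})      # gap 2 > 1
not_done = training_complete(estimates, {"pi_a": 1.5, "pi_b": 1.5})  # gap 2 < 3
```

Estimating the gap online, as the abstract suggests, lets the radii shrink only as far as this test requires, rather than to a fixed epsilon chosen in advance.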
Lower Bounding Klondike Solitaire with Monte-Carlo Planning
Bjarnason, Ronald (Oregon State University) | Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University)
Despite its ubiquitous presence, very little is known about the odds of winning the simple card game of Klondike Solitaire. The main goal of this paper is to investigate the use of probabilistic planning to shed light on this issue. Unfortunately, most probabilistic planning techniques are not well suited for Klondike due to the difficulties of representing the domain in standard planning languages and the complexity of the required search. Klondike thus serves as an interesting addition to the complement of probabilistic planning domains. In this paper, we study Klondike using several sampling-based planning approaches including UCT, hindsight optimization, and sparse sampling, and establish lower bounds on their performance. We also introduce novel combinations of these approaches and evaluate them in Klondike. We provide a theoretical bound on the sample complexity of a method that naturally combines sparse sampling and UCT. Our results demonstrate that there is a policy that, within tight confidence intervals, wins over 35% of Klondike games. This result is the first reported lower bound on the win rate of an optimal Klondike policy.
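The UCT approach named above selects actions by mean value plus an exploration bonus. A minimal sketch of that selection rule follows; the action names and visit statistics are illustrative, not from the paper's experiments.

```python
import math

# Sketch of the UCT action-selection rule: argmax of mean value
# plus an exploration bonus that shrinks with visit count.

def uct_select(stats, c=1.4):
    """stats: {action: (visits, total_value)}; returns the UCT-best action."""
    n_total = sum(n for n, _ in stats.values())

    def score(item):
        n, total = item[1]
        return total / n + c * math.sqrt(math.log(n_total) / n)

    return max(stats.items(), key=score)[0]

# "move_ace" has a higher mean AND fewer visits, so it wins on both terms.
stats = {"draw": (10, 6.0), "move_ace": (2, 1.5)}
best = uct_select(stats)
```

In a full Klondike player this rule would sit inside the tree-search loop, with rollouts (or sparse-sampling estimates, in the combined method the abstract mentions) supplying the value totals.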